Delay-aware model-based reinforcement learning for continuous control
نویسندگان
چکیده
Action delays degrade the performance of reinforcement learning in many real-world systems. This paper proposes a formal definition delay-aware Markov Decision Process and proves it can be transformed into standard MDP with augmented states using reward process. We develop model-based framework that incorporate multi-step delay learned system models without effort. Experiments Gym MuJoCo platforms show proposed algorithm is more efficient training transferable between systems various durations compared state-of-the-art model-free methods.
منابع مشابه
Value-Aware Loss Function for Model-based Reinforcement Learning
We consider the problem of estimating the transition probability kernel to be used by a model-based reinforcement learning (RL) algorithm. We argue that estimating a generative model that minimizes a probabilistic loss, such as the log-loss, is an overkill because it does not take into account the underlying structure of decision problem and the RL algorithm that intends to solve it. We introdu...
متن کاملReinforcement Learning for Continuous Stochastic Control Problems
This paper is concerned with the problem of Reinforcement Learning (RL) for continuous state space and time stocha.stic control problems. We state the Harnilton-Jacobi-Bellman equation satisfied by the value function and use a Finite-Difference method for designing a convergent approximation scheme. Then we propose a RL algorithm based on this scheme and prove its convergence to the optimal sol...
متن کاملBenchmarking Deep Reinforcement Learning for Continuous Control
Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning. Some notable examples include training agents to play Atari games based on raw pixel data and to acquire advanced manipulation skills using raw sensory inputs. However, it has been difficult to quantify progress in the domain of continuou...
متن کاملReinforcement Learning Based PID Control of Wind Energy Conversion Systems
In this paper an adaptive PID controller for Wind Energy Conversion Systems (WECS) has been developed. Theadaptation technique applied to this controller is based on Reinforcement Learning (RL) theory. Nonlinearcharacteristics of wind variations as plant input, wind turbine structure and generator operational behaviordemand for high quality adaptive controller to ensure both robust stability an...
متن کاملEfficient reinforcement learning: model-based Acrobot control
|Several methods have been proposed in the reinforcement learning literature for learning optimal policies for sequential decision tasks. Q-learning is a model-free algorithm that has recently been applied to the Acrobot, a two-link arm with a single actuator at the elbow that learns to swing its free endpoint above a target height. However, applying Q-learning to a real Acrobot may be impracti...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Neurocomputing
سال: 2021
ISSN: ['0925-2312', '1872-8286']
DOI: https://doi.org/10.1016/j.neucom.2021.04.015